Overview

Dataset statistics

Number of variables: 22
Number of observations: 15532
Missing cells: 7364
Missing cells (%): 2.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.6 MiB
Average record size in memory: 176.0 B
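
The missing-cell percentage can be reproduced from the counts above. This is a minimal sketch, assuming the percentage is the share of missing cells among all cells (variables × observations); the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 22
n_observations = 15532
missing_cells = 7364

# Assumption: "Missing cells (%)" = missing cells / total number of cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Under that assumption the result rounds to the 2.2% shown in the report.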

Variable types

Numeric: 10
Categorical: 12

Alerts

ClaimInd has constant value "1" [Constant]
RecordBeg has a high cardinality: 349 distinct values [High cardinality]
RecordEnd has a high cardinality: 360 distinct values [High cardinality]
df_index is highly correlated with Dataset [High correlation]
LicAge is highly correlated with DrivAge and 1 other field [High correlation]
DrivAge is highly correlated with LicAge and 1 other field [High correlation]
BonusMalus is highly correlated with LicAge and 1 other field [High correlation]
Dataset is highly correlated with df_index [High correlation]
DrivAge is highly correlated with LicAge [High correlation]
BonusMalus is highly correlated with LicAge [High correlation]
Gender is highly correlated with ClaimInd [High correlation]
ClaimNbFireTheft is highly correlated with ClaimInd [High correlation]
SocioCateg is highly correlated with ClaimInd [High correlation]
Dataset is highly correlated with ClaimInd [High correlation]
ClaimNbParking is highly correlated with ClaimInd [High correlation]
ClaimNbResp is highly correlated with ClaimInd [High correlation]
HasKmLimit is highly correlated with ClaimInd [High correlation]
MariStat is highly correlated with ClaimInd [High correlation]
VehUsage is highly correlated with ClaimInd [High correlation]
ClaimInd is highly correlated with Gender and 8 other fields [High correlation]
LicAge is highly correlated with MariStat and 3 other fields [High correlation]
MariStat is highly correlated with LicAge and 1 other field [High correlation]
SocioCateg is highly correlated with LicAge and 2 other fields [High correlation]
VehUsage is highly correlated with SocioCateg and 1 other field [High correlation]
DrivAge is highly correlated with LicAge and 4 other fields [High correlation]
BonusMalus is highly correlated with LicAge and 2 other fields [High correlation]
ClaimNbResp is highly correlated with BonusMalus [High correlation]
RecordEnd has 7364 (47.4%) missing values [Missing]
ClaimAmount is highly skewed (γ1 = 62.22157794) [Skewed]
df_index has unique values [Unique]
LicAge has 406 (2.6%) zeros [Zeros]
ClaimNbNonResp has 10762 (69.3%) zeros [Zeros]
ClaimNbWindscreen has 10238 (65.9%) zeros [Zeros]
OutUseNb has 12547 (80.8%) zeros [Zeros]
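
The percentages quoted in the Missing and Zeros alerts are each count divided by the number of observations (15532). A short sketch that recomputes them; the dictionary keys and the helper name `share_pct` are illustrative, not part of the report.

```python
N_OBSERVATIONS = 15532  # from the dataset statistics

# Counts quoted in the Missing and Zeros alerts.
alert_counts = {
    "RecordEnd (missing)": 7364,
    "LicAge (zeros)": 406,
    "ClaimNbNonResp (zeros)": 10762,
    "ClaimNbWindscreen (zeros)": 10238,
    "OutUseNb (zeros)": 12547,
}


def share_pct(count, total=N_OBSERVATIONS):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


alert_pcts = {name: share_pct(c) for name, c in alert_counts.items()}
for name, pct in alert_pcts.items():
    print(f"{name}: {pct}%")
```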

Reproduction

Analysis started: 2021-11-15 17:01:13.122275
Analysis finished: 2021-11-15 17:01:58.967834
Duration: 45.85 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
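
The reported duration follows directly from the two timestamps. A minimal sketch using only the standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed timestamp layout
started = datetime.strptime("2021-11-15 17:01:13.122275", FMT)
finished = datetime.strptime("2021-11-15 17:01:58.967834", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```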

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 15532
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 226187.6985
Minimum: 145813
Maximum: 310976
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 121.5 KiB

Quantile statistics

Minimum: 145813
5th percentile: 153949.5
Q1: 186848
Median: 225879.5
Q3: 265452.25
95th percentile: 300420.9
Maximum: 310976
Range: 165163
Interquartile range (IQR): 78604.25

Descriptive statistics

Standard deviation: 46312.61788
Coefficient of variation (CV): 0.2047530356
Kurtosis: -1.145728219
Mean: 226187.6985
Median Absolute Deviation (MAD): 39321.5
Skewness: 0.03950834991
Sum: 3513147333
Variance: 2144858574
Monotonicity: Strictly increasing
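
Several of the df_index figures are derived from one another. The sketch below recomputes range, IQR, CV, variance and mean from the other reported values; it assumes the usual textbook definitions (CV = standard deviation / mean, variance = standard deviation squared), which match the report to the printed precision.

```python
# Values copied from the df_index statistics.
minimum, maximum = 145813, 310976
q1, q3 = 186848, 265452.25
mean, std = 226187.6985, 46312.61788
total, n = 3513147333, 15532

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
mean_from_sum = total / n         # Mean = Sum / number of observations

print(value_range, iqr, round(cv, 10), round(variance), round(mean_from_sum, 4))
```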
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                  Count   Frequency (%)
145813                 1       < 0.1%
251973                 1       < 0.1%
251643                 1       < 0.1%
251648                 1       < 0.1%
251650                 1       < 0.1%
251673                 1       < 0.1%
251694                 1       < 0.1%
251725                 1       < 0.1%
251730                 1       < 0.1%
251744                 1       < 0.1%
Other values (15522)   15522   99.9%

Minimum 10 values

Value    Count   Frequency (%)
145813   1       < 0.1%
145814   1       < 0.1%
145833   1       < 0.1%
145845   1       < 0.1%
145846   1       < 0.1%
145850   1       < 0.1%
145863   1       < 0.1%
145866   1       < 0.1%
145883   1       < 0.1%
145899   1       < 0.1%

Maximum 10 values

Value    Count   Frequency (%)
310976   1       < 0.1%
310973   1       < 0.1%
310967   1       < 0.1%
310963   1       < 0.1%
310910   1       < 0.1%
310899   1       < 0.1%
310884   1       < 0.1%
310880   1       < 0.1%
310878   1       < 0.1%
310862   1       < 0.1%
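
The df_index histogram uses fixed size bins (bins=50). A minimal sketch of equal-width binning, assuming the bins span the observed minimum to maximum and that the last bin is closed on the right; the function name `bin_index` is illustrative and not taken from the report's software.

```python
def bin_index(value, lo, hi, bins):
    """Return the 0-based index of the equal-width bin containing value.

    The last bin is closed on the right so that value == hi falls in it.
    """
    if not lo <= value <= hi:
        raise ValueError("value outside [lo, hi]")
    width = (hi - lo) / bins
    return min(int((value - lo) / width), bins - 1)


lo, hi, bins = 145813, 310976, 50   # df_index minimum, maximum, bins=50
bin_width = (hi - lo) / bins        # width of each bin

print(bin_width, bin_index(225879.5, lo, hi, bins))
```

With these inputs each bin covers roughly 3303 index values, and the median falls near the middle bin, consistent with the near-zero skewness reported for df_index.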

Exposure
Real number (ℝ≥0)

Distinct: 739
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5966357842
Minimum: 0.002
Maximum: 1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 121.5 KiB

Quantile statistics

Minimum: 0.002
5th percentile: 0.143
Q1: 0.408
Median: 0.606
Q3: 0.833
95th percentile: 1
Maximum: 1
Range: 0.998
Interquartile range (IQR): 0.425

Descriptive statistics

Standard deviation: 0.2638269567
Coefficient of variation (CV): 0.442190971
Kurtosis: -0.9637876155
Mean: 0.5966357842
Median Absolute Deviation (MAD): 0.213
Skewness: -0.2398052898
Sum: 9266.947
Variance: 0.0696046631
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
1                    1046    6.7%
0.833                843     5.4%
0.916                810     5.2%
0.666                648     4.2%
0.749                638     4.1%
0.583                556     3.6%
0.5                  481     3.1%
0.416                468     3.0%
0.75                 458     2.9%
0.499                439     2.8%
Other values (729)   9145    58.9%

Minimum 10 values

Value   Count   Frequency (%)
0.002   5       < 0.1%
0.005   3       < 0.1%
0.008   7       < 0.1%
0.009   1       < 0.1%
0.01    4       < 0.1%
0.013   3       < 0.1%
0.014   2       < 0.1%
0.016   3       < 0.1%
0.019   5       < 0.1%
0.021   8       0.1%

Maximum 10 values

Value   Count   Frequency (%)
1       1046    6.7%
0.998   16      0.1%
0.997   12      0.1%
0.996   17      0.1%
0.994   6       < 0.1%
0.993   4       < 0.1%
0.991   6       < 0.1%
0.99    11      0.1%
0.989   8       0.1%
0.987   12      0.1%

LicAge
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 18
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.401300541
Minimum: 0
Maximum: 17
Zeros: 406
Zeros (%): 2.6%
Negative: 0
Negative (%): 0.0%
Memory size: 121.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 3
Median: 5
Q3: 8
95th percentile: 11
Maximum: 17
Range: 17
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.046795481
Coefficient of variation (CV): 0.5640855305
Kurtosis: -0.6134681806
Mean: 5.401300541
Median Absolute Deviation (MAD): 2
Skewness: 0.2667288125
Sum: 83893
Variance: 9.282962702
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=18)]

Common values

Value              Count   Frequency (%)
5                  1820    11.7%
2                  1642    10.6%
8                  1632    10.5%
4                  1608    10.4%
3                  1548    10.0%
7                  1547    10.0%
6                  1536    9.9%
1                  1215    7.8%
9                  1047    6.7%
10                 724     4.7%
Other values (8)   1213    7.8%

Minimum 10 values

Value   Count   Frequency (%)
0       406     2.6%
1       1215    7.8%
2       1642    10.6%
3       1548    10.0%
4       1608    10.4%
5       1820    11.7%
6       1536    9.9%
7       1547    10.0%
8       1632    10.5%
9       1047    6.7%

Maximum 10 values

Value   Count   Frequency (%)
17      1       < 0.1%
16      4       < 0.1%
15      18      0.1%
14      34      0.2%
13      80      0.5%
12      217     1.4%
11      453     2.9%
10      724     4.7%
9       1047    6.7%
8       1632    10.5%

RecordBeg
Categorical

HIGH CARDINALITY

Distinct: 349
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 121.5 KiB

Top categories: 2004-01-01 (6854), 2004-04-01 (837), 2004-03-01 (608), 2004-02-01 (548), 2004-07-01 (516), other values (344 categories, 6169 rows)

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): 0.1%

Sample

1st row: 2004-05-19
2nd row: 2004-01-01
3rd row: 2004-10-23
4th row: 2004-01-01
5th row: 2004-01-01

Common Values

Value                Count   Frequency (%)
2004-01-01           6854    44.1%
2004-04-01           837     5.4%
2004-03-01           608     3.9%
2004-02-01           548     3.5%
2004-07-01           516     3.3%
2004-06-01           458     2.9%
2004-05-01           423     2.7%
2004-09-01           261     1.7%
2004-10-01           242     1.6%
2004-08-01           219     1.4%
Other values (339)   4566    29.4%

Length

[Figure: histogram of lengths of the category]

Words

Value                Count   Frequency (%)
2004-01-01           6854    44.1%
2004-04-01           837     5.4%
2004-03-01           608     3.9%
2004-02-01           548     3.5%
2004-07-01           516     3.3%
2004-06-01           458     2.9%
2004-05-01           423     2.7%
2004-09-01           261     1.7%
2004-10-01           242     1.6%
2004-08-01           219     1.4%
Other values (339)   4566    29.4%


RecordEnd
Categorical

HIGH CARDINALITY
MISSING

Distinct: 360
Distinct (%): 4.4%
Missing: 7364
Missing (%): 47.4%
Memory size: 121.5 KiB

Top categories: 2004-12-01 (587), 2004-10-01 (566), 2004-07-01 (522), 2004-11-01 (479), 2004-09-01 (397), other values (355 categories, 5617 rows)

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 8
Unique (%): 0.1%

Sample

1st row: 2004-10-05
2nd row: 2004-11-01
3rd row: 2004-11-01
4th row: 2004-12-16
5th row: 2004-06-25

Common Values

Value                Count   Frequency (%)
2004-12-01           587     3.8%
2004-10-01           566     3.6%
2004-07-01           522     3.4%
2004-11-01           479     3.1%
2004-09-01           397     2.6%
2004-06-01           285     1.8%
2004-04-01           250     1.6%
2004-08-01           233     1.5%
2004-05-01           205     1.3%
2004-03-01           115     0.7%
Other values (350)   4529    29.2%
(Missing)            7364    47.4%

Length

[Figure: histogram of lengths of the category]

Words

Value                Count   Frequency (%)
2004-12-01           587     7.2%
2004-10-01           566     6.9%
2004-07-01           522     6.4%
2004-11-01           479     5.9%
2004-09-01           397     4.9%
2004-06-01           285     3.5%
2004-04-01           250     3.1%
2004-08-01           233     2.9%
2004-05-01           205     2.5%
2004-03-01           115     1.4%
Other values (350)   4529    55.4%
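
The two RecordEnd frequency tables list the same counts but different percentages. The figures are consistent with the first table dividing by all 15532 observations and the second dividing only by the non-missing ones. A small sketch of that reading, using the 2004-12-01 row:

```python
n_observations = 15532
n_missing = 7364                        # RecordEnd missing values
n_present = n_observations - n_missing  # rows with a RecordEnd value

count = 587                             # rows with RecordEnd == "2004-12-01"
pct_of_all = round(100 * count / n_observations, 1)
pct_of_present = round(100 * count / n_present, 1)

print(n_present, pct_of_all, pct_of_present)
```

The same denominator explains the "Distinct (%)" of 4.4% for this variable: 360 distinct values out of 8168 non-missing rows.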


Gender
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 121.5 KiB

Top categories: 0 (9625), 1 (5907)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value   Count   Frequency (%)
0       9625    62.0%
1       5907    38.0%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Words

Value   Count   Frequency (%)
0       9625    62.0%
1       5907    38.0%


MariStat
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 121.5 KiB

Top categories: 0 (12994), 1 (2538)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       12994   83.7%
1       2538    16.3%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Words

Value   Count   Frequency (%)
0       12994   83.7%
1       2538    16.3%


SocioCateg
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 121.5 KiB

Top categories: CSP5 (10450), CSP6 (2780), CSP4 (1271), CSP2 (443), CSP1 (428), other values (2 categories, 160 rows)

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: CSP6
2nd row: CSP5
3rd row: CSP5
4th row: CSP6
5th row: CSP5

Common Values

Value   Count   Frequency (%)
CSP5    10450   67.3%
CSP6    2780    17.9%
CSP4    1271    8.2%
CSP2    443     2.9%
CSP1    428     2.8%
CSP3    159     1.0%
CSP7    1       < 0.1%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Words

Value   Count   Frequency (%)
csp5    10450   67.3%
csp6    2780    17.9%
csp4    1271    8.2%
csp2    443     2.9%
csp1    428     2.8%
csp3    159     1.0%
csp7    1       < 0.1%


VehUsage
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 121.5 KiB

Top categories: Private+trip to office (8421), Private (4264), Professional (2438), Professional run (409)

Length

Max length: 22
Median length: 22
Mean length: 16.15439093
Min length: 7
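
The VehUsage length statistics can be recovered from the four category labels and their counts. A minimal sketch, assuming length means the number of characters in each label as written in this report:

```python
# Category counts from the VehUsage overview.
counts = {
    "Private+trip to office": 8421,
    "Private": 4264,
    "Professional": 2438,
    "Professional run": 409,
}

total = sum(counts.values())
# Count-weighted mean of the label lengths.
mean_length = sum(len(label) * n for label, n in counts.items()) / total
lengths = sorted(len(label) for label in counts)

print(total, lengths, round(mean_length, 8))
```

The median length is 22 because the longest label, "Private+trip to office", covers more than half of the rows.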

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Private
2nd row: Private+trip to office
3rd row: Private+trip to office
4th row: Private
5th row: Private+trip to office

Common Values

Value                    Count   Frequency (%)
Private+trip to office   8421    54.2%
Private                  4264    27.5%
Professional             2438    15.7%
Professional run         409     2.6%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Words

Value          Count   Frequency (%)
private+trip   8421    25.7%
to             8421    25.7%
office         8421    25.7%
private        4264    13.0%
professional   2847    8.7%
run            409     1.2%
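
The last VehUsage table counts words rather than whole category labels. Its figures are consistent with lowercasing each label and splitting it on whitespace, which the sketch below reproduces. That tokenisation rule is inferred from the numbers, not confirmed by the report.

```python
from collections import Counter

# Category counts from the VehUsage common values table.
counts = {
    "Private+trip to office": 8421,
    "Private": 4264,
    "Professional": 2438,
    "Professional run": 409,
}

# Assumed tokenisation: lowercase, then split on whitespace.
word_counts = Counter()
for label, n in counts.items():
    for word in label.lower().split():
        word_counts[word] += n

total_words = sum(word_counts.values())
for word, n in word_counts.most_common():
    print(f"{word}: {n} ({100 * n / total_words:.1f}%)")
```

"professional" reaches 2847 because it appears in both "Professional" (2438) and "Professional run" (409), and the percentages are shares of all 32783 words rather than of the 15532 rows.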


DrivAge
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 56
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.87574041
Minimum: 20
Maximum: 75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 121.5 KiB

Quantile statistics

Minimum: 20
5th percentile: 25
Q1: 35
Median: 46
Q3: 57
95th percentile: 73
Maximum: 75
Range: 55
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 14.25982527
Coefficient of variation (CV): 0.3042048007
Kurtosis: -0.895597215
Mean: 46.87574041
Median Absolute Deviation (MAD): 11
Skewness: 0.184724032
Sum: 728074
Variance: 203.3426166
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count   Frequency (%)
75                  594     3.8%
54                  420     2.7%
56                  410     2.6%
51                  389     2.5%
55                  373     2.4%
57                  371     2.4%
38                  371     2.4%
53                  370     2.4%
41                  365     2.3%
40                  361     2.3%
Other values (46)   11508   74.1%

Minimum 10 values

Value   Count   Frequency (%)
20      35      0.2%
21      74      0.5%
22      131     0.8%
23      156     1.0%
24      179     1.2%
25      224     1.4%
26      265     1.7%
27      296     1.9%
28      280     1.8%
29      352     2.3%

Maximum 10 values

Value   Count   Frequency (%)
75      594     3.8%
74      105     0.7%
73      118     0.8%
72      103     0.7%
71      125     0.8%
70      126     0.8%
69      129     0.8%
68      182     1.2%
67      157     1.0%
66      175     1.1%

HasKmLimit
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 121.5 KiB

Top categories: 0 (14411), 1 (1121)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       14411   92.8%
1       1121    7.2%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Words

Value   Count   Frequency (%)
0       14411   92.8%
1       1121    7.2%


BonusMalus
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 77
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 61.27105331
Minimum: 50
Maximum: 183
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 121.5 KiB

Quantile statistics

Minimum: 50
5th percentile: 50
Q1: 50
Median: 50
Q3: 71
95th percentile: 95
Maximum: 183
Range: 133
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 16.75145044
Coefficient of variation (CV): 0.2733990936
Kurtosis: 3.116495996
Mean: 61.27105331
Median Absolute Deviation (MAD): 0
Skewness: 1.693147958
Sum: 951662
Variance: 280.6110917
Monotonicity: Not monotonic
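
BonusMalus has a MAD of 0 even though its standard deviation is about 16.75. That happens whenever more than half of the observations equal the median, and here the median equals the minimum (50). The sketch below shows the effect on a hypothetical toy sample; the sample values are illustrative, not data from the report, and MAD is taken as the unscaled median of absolute deviations from the median.

```python
from statistics import median


def mad(values):
    """Median absolute deviation around the median (unscaled)."""
    m = median(values)
    return median(abs(v - m) for v in values)


# Hypothetical toy sample: 6 of 10 values sit at the minimum, 50.
sample = [50, 50, 50, 50, 50, 50, 57, 72, 80, 90]

print(median(sample), mad(sample))
```

Because most deviations are exactly zero, the median of the deviations is zero, regardless of how far the remaining values spread.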
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count   Frequency (%)
50                  8219    52.9%
80                  624     4.0%
90                  589     3.8%
76                  533     3.4%
85                  522     3.4%
72                  499     3.2%
68                  441     2.8%
57                  403     2.6%
64                  395     2.5%
60                  394     2.5%
Other values (67)   2913    18.8%

Minimum 10 values

Value   Count   Frequency (%)
50      8219    52.9%
51      270     1.7%
52      117     0.8%
53      87      0.6%
54      333     2.1%
55      152     1.0%
56      84      0.5%
57      403     2.6%
58      153     1.0%
59      75      0.5%

Maximum 10 values

Value   Count   Frequency (%)
183     1       < 0.1%
175     1       < 0.1%
165     1       < 0.1%
156     7       < 0.1%
148     9       0.1%
147     13      0.1%
146     2       < 0.1%
143     2       < 0.1%
140     11      0.1%
139     5       < 0.1%

ClaimAmount
Real number (ℝ≥0)

SKEWED

Distinct: 8624
Distinct (%): 55.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2129.25981
Minimum: 0.1885196375
Maximum: 802620.271
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 121.5 KiB

Quantile statistics

Minimum: 0.1885196375
5th percentile: 93.97173716
Q1: 323.3730363
Median: 781.1782477
Q3: 2163.26284
95th percentile: 6914.46435
Maximum: 802620.271
Range: 802620.0825
Interquartile range (IQR): 1839.889804

Descriptive statistics

Standard deviation: 10287.30525
Coefficient of variation (CV): 4.831399719
Kurtosis: 4738.381136
Mean: 2129.25981
Median Absolute Deviation (MAD): 637.4320242
Skewness: 62.22157794
Sum: 33071663.37
Variance: 105828649.3
Monotonicity: Not monotonic
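
ClaimAmount is flagged as skewed: its mean (about 2129) is far above its median (about 781), and a few very large claims stretch the right tail. The sketch below shows one common way to compute sample skewness, the bias-corrected Fisher-Pearson coefficient. It assumes, without confirmation from the report, that this is the estimator behind the γ1 value. The toy samples are hypothetical.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness G1 (bias-corrected)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # 2nd central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # 3rd central moment
    if m2 == 0:
        return 0.0
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]       # hypothetical toy data
right_tailed = [1.0, 1.0, 1.0, 2.0, 50.0]   # hypothetical toy data

# Coefficient of variation from the reported ClaimAmount figures.
cv = 10287.30525 / 2129.25981

print(sample_skewness(symmetric), sample_skewness(right_tailed), round(cv, 6))
```

A single large value is enough to push the coefficient well above zero, which is the pattern the maximum of 802620.271 produces in the real column.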
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                 Count   Frequency (%)
1418.610272           603     3.9%
102.5900302           466     3.0%
4326.52568            280     1.8%
97.32326284           197     1.3%
1562.356495           190     1.2%
2764.169184           97      0.6%
2163.26284            90      0.6%
4241.691843           74      0.5%
1531.722054           50      0.3%
1093.413897           43      0.3%
Other values (8614)   13442   86.5%

Minimum 10 values

Value          Count   Frequency (%)
0.1885196375   1       < 0.1%
0.2356495468   1       < 0.1%
1.178247734    6       < 0.1%
1.767371601    1       < 0.1%
4.123867069    1       < 0.1%
4.783685801    2       < 0.1%
4.913293051    1       < 0.1%
6.315407855    1       < 0.1%
8.283081571    1       < 0.1%
8.836858006    2       < 0.1%

Maximum 10 values

Value         Count   Frequency (%)
802620.271    2       < 0.1%
163427.0139   1       < 0.1%
154957.1801   2       < 0.1%
139031.3946   2       < 0.1%
119887.9795   2       < 0.1%
95150.96284   2       < 0.1%
84434.81148   1       < 0.1%
58086.52931   2       < 0.1%
55699.82356   2       < 0.1%
52976.59849   1       < 0.1%

ClaimInd
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 121.5 KiB

Top categories: 1 (15532)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count   Frequency (%)
1       15532   100.0%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Words

Value   Count   Frequency (%)
1       15532   100.0%


Dataset
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 121.5 KiB

Top categories: 6 (4214), 7 (3639), 8 (3219), 5 (2418), 9 (2042)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5
2nd row: 5
3rd row: 5
4th row: 5
5th row: 5

Common Values

Value   Count   Frequency (%)
6       4214    27.1%
7       3639    23.4%
8       3219    20.7%
5       2418    15.6%
9       2042    13.1%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Words

Value   Count   Frequency (%)
6       4214    27.1%
7       3639    23.4%
8       3219    20.7%
5       2418    15.6%
9       2042    13.1%


ClaimNbResp
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 121.5 KiB

Top categories: 0.0 (11619), 1.0 (3269), 2.0 (571), 3.0 (63), 4.0 (10)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value   Count   Frequency (%)
0.0     11619   74.8%
1.0     3269    21.0%
2.0     571     3.7%
3.0     63      0.4%
4.0     10      0.1%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Words

Value   Count   Frequency (%)
0.0     11619   74.8%
1.0     3269    21.0%
2.0     571     3.7%
3.0     63      0.4%
4.0     10      0.1%


ClaimNbNonResp
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3819855782
Minimum: 0
Maximum: 7
Zeros: 10762
Zeros (%): 69.3%
Negative: 0
Negative (%): 0.0%
Memory size: 121.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 2
Maximum: 7
Range: 7
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.6458676619
Coefficient of variation (CV): 1.690816876
Kurtosis: 4.579587282
Mean: 0.3819855782
Median Absolute Deviation (MAD): 0
Skewness: 1.890375626
Sum: 5933
Variance: 0.4171450367
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=7)]

Common values

Value   Count   Frequency (%)
0       10762   69.3%
1       3803    24.5%
2       807     5.2%
3       132     0.8%
4       22      0.1%
5       5       < 0.1%
7       1       < 0.1%

Minimum 10 values

Value   Count   Frequency (%)
0       10762   69.3%
1       3803    24.5%
2       807     5.2%
3       132     0.8%
4       22      0.1%
5       5       < 0.1%
7       1       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
7       1       < 0.1%
5       5       < 0.1%
4       22      0.1%
3       132     0.8%
2       807     5.2%
1       3803    24.5%
0       10762   69.3%
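
Because ClaimNbNonResp takes only seven distinct values, its frequency table is enough to rebuild the sum, mean, variance and standard deviation. A minimal sketch; dividing by n - 1 for the variance is an assumption, though it reproduces the reported figure.

```python
import math

# Value -> count, from the ClaimNbNonResp frequency table.
freq = {0: 10762, 1: 3803, 2: 807, 3: 132, 4: 22, 5: 5, 7: 1}

n = sum(freq.values())
total = sum(v * c for v, c in freq.items())
mean = total / n
# Sample variance with n - 1 in the denominator (assumed convention).
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std = math.sqrt(variance)

print(n, total, round(mean, 10), round(variance, 10), round(std, 10))
```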

ClaimNbParking
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 121.5 KiB

Top categories: 0.0 (14191), 1.0 (1218), 2.0 (115), 3.0 (8)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value   Count   Frequency (%)
0.0     14191   91.4%
1.0     1218    7.8%
2.0     115     0.7%
3.0     8       0.1%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Words

Value   Count   Frequency (%)
0.0     14191   91.4%
1.0     1218    7.8%
2.0     115     0.7%
3.0     8       0.1%


ClaimNbFireTheft
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 121.5 KiB

Top categories: 0.0 (14296), 1.0 (1145), 2.0 (90), 4.0 (1)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value   Count   Frequency (%)
0.0     14296   92.0%
1.0     1145    7.4%
2.0     90      0.6%
4.0     1       < 0.1%

Length

[Figure: histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Words

Value   Count   Frequency (%)
0.0     14296   92.0%
1.0     1145    7.4%
2.0     90      0.6%
4.0     1       < 0.1%


ClaimNbWindscreen
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4363893896
Minimum: 0
Maximum: 5
Zeros: 10238
Zeros (%): 65.9%
Negative: 0
Negative (%): 0.0%
Memory size: 121.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 2
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.6911115883
Coefficient of variation (CV): 1.583703923
Kurtosis: 3.384159201
Mean: 0.4363893896
Median Absolute Deviation (MAD): 0
Skewness: 1.720527933
Sum: 6778
Variance: 0.4776352274
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=6)]

Common values

Value   Count   Frequency (%)
0       10238   65.9%
1       4071    26.2%
2       1010    6.5%
3       171     1.1%
4       36      0.2%
5       6       < 0.1%

Minimum 10 values

Value   Count   Frequency (%)
0       10238   65.9%
1       4071    26.2%
2       1010    6.5%
3       171     1.1%
4       36      0.2%
5       6       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
5       6       < 0.1%
4       36      0.2%
3       171     1.1%
2       1010    6.5%
1       4071    26.2%
0       10238   65.9%

OutUseNb
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3109708988
Minimum: 0
Maximum: 5
Zeros: 12547
Zeros (%): 80.8%
Negative: 0
Negative (%): 0.0%
Memory size: 121.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 2
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.7638578956
Coefficient of variation (CV): 2.456364562
Kurtosis: 10.53932642
Mean: 0.3109708988
Median Absolute Deviation (MAD): 0
Skewness: 3.076759748
Sum: 4830
Variance: 0.5834788847
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=6)]

Common values

Value   Count   Frequency (%)
0       12547   80.8%
1       1869    12.0%
2       631     4.1%
3       290     1.9%
4       146     0.9%
5       49      0.3%

Minimum 10 values

Value   Count   Frequency (%)
0       12547   80.8%
1       1869    12.0%
2       631     4.1%
3       290     1.9%
4       146     0.9%
5       49      0.3%

Maximum 10 values

Value   Count   Frequency (%)
5       49      0.3%
4       146     0.9%
3       290     1.9%
2       631     4.1%
1       1869    12.0%
0       12547   80.8%

RiskArea
Real number (ℝ≥0)

Distinct        13
Distinct (%)    0.1%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            7.88301571
Minimum         1
Maximum         13
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     121.5 KiB

Quantile statistics

Minimum                      1
5-th percentile              4
Q1                           6
median                       8
Q3                           10
95-th percentile             11
Maximum                      13
Range                        12
Interquartile range (IQR)    4

Descriptive statistics

Standard deviation                 2.239167609
Coefficient of variation (CV)      0.284049619
Kurtosis                           -0.6169234101
Mean                               7.88301571
Median Absolute Deviation (MAD)    2
Skewness                           -0.3378824716
Sum                                122439
Variance                           5.013871582
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=13)
Common values

Value               Count    Frequency (%)
7                   3343     21.5%
10                  2909     18.7%
6                   2164     13.9%
9                   2143     13.8%
11                  1953     12.6%
5                   1033     6.7%
8                   838      5.4%
4                   613      3.9%
3                   268      1.7%
2                   207      1.3%
Other values (3)    61       0.4%

Extreme values (smallest first)

Value    Count    Frequency (%)
1        9        0.1%
2        207      1.3%
3        268      1.7%
4        613      3.9%
5        1033     6.7%
6        2164     13.9%
7        3343     21.5%
8        838      5.4%
9        2143     13.8%
10       2909     18.7%

Extreme values (largest first)

Value    Count    Frequency (%)
13       22       0.1%
12       30       0.2%
11       1953     12.6%
10       2909     18.7%
9        2143     13.8%
8        838      5.4%
7        3343     21.5%
6        2164     13.9%
5        1033     6.7%
4        613      3.9%

Interactions

[Interaction plots: 100 Matplotlib figures, images not reproduced here.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, so it captures nonlinear monotonic relationships better than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
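The calculation described above can be sketched in plain Python: rank both variables, then take the Pearson correlation of the ranks. This is an illustrative, dependency-free sketch, not the report's own implementation. The helper names are hypothetical, and ties are given average ranks, which is a common convention assumed here.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x

print(spearman(x, y), pearson(x, y))
```

On this toy input the relationship is perfectly monotonic, so ρ reaches 1 while Pearson's r stays below 1.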

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
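A minimal sketch of this calculation, with made-up data, also illustrates the invariance property: shifting and rescaling either variable by a positive factor leaves r unchanged, while negating one variable flips its sign. The function name and sample values are illustrative assumptions.

```python
from statistics import mean

def pearson_r(x, y):
    """Sample covariance of x and y divided by the product of their sample standard deviations."""
    n = len(x)
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    sx = (sum((a - mx) ** 2 for a in x) / (n - 1)) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / (n - 1)) ** 0.5
    return cov / (sx * sy)

# Hypothetical data, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Separate changes of location and positive scale on each variable.
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# A negative scale factor reverses the direction of the relationship.
r_negated = pearson_r(x, [-b for b in y])

print(r, r_rescaled, r_negated)
```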

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
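The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements that definition directly on a small made-up sample. Note that library routines may differ: `scipy.stats.kendalltau`, for instance, computes the tie-adjusted tau-b by default, so results can diverge when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical sample without ties: 7 concordant and 3 discordant pairs.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]

print(kendall_tau_a(x, y))
```

The quadratic loop over all pairs is fine for illustration; for a dataset of this report's size a library routine would be the practical choice.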

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
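As an illustration, the bias correction can be sketched as follows. This follows the correction as it is commonly stated (subtract the expected bias from φ² and shrink the row and column counts) and is not taken from the report's source code. The function name and the toy contingency tables are assumptions, and every row and column total is assumed to be positive.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrected quantities: remove the expected bias, never going below zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

independent = [[10, 20], [20, 40]]  # rows are proportional: no association
perfect = [[50, 0], [0, 50]]        # each row determines the column

print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```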

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
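The per-column nullity summarized by the first chart can be approximated with a simple count of missing cells. The sketch below uses a toy three-row excerpt with a subset of the dataset's column names; the values and the helper name are illustrative, and `None` stands in for a missing cell.

```python
# Toy excerpt (illustrative values); None marks a missing cell.
rows = [
    {"RecordBeg": "2004-05-19", "RecordEnd": None, "DrivAge": 68},
    {"RecordBeg": "2004-01-01", "RecordEnd": "2004-10-05", "DrivAge": 47},
    {"RecordBeg": "2004-10-23", "RecordEnd": "2004-11-01", "DrivAge": 49},
]

def missing_by_column(rows):
    """Count missing (None) cells per column for a list of same-keyed dicts."""
    return {col: sum(row[col] is None for row in rows) for col in rows[0]}

counts = missing_by_column(rows)
total_cells = len(rows) * len(rows[0])
missing_pct = 100 * sum(counts.values()) / total_cells  # share of missing cells

print(counts, round(missing_pct, 1))
```

The same ratio, missing cells over rows times columns, is what a "Missing cells (%)" figure expresses for a full dataset.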

Sample

First rows

       df_index  Exposure  LicAge  RecordBeg   RecordEnd   Gender  MariStat  SocioCateg  VehUsage                DrivAge  HasKmLimit  BonusMalus  ClaimAmount  ClaimInd  Dataset  ClaimNbResp  ClaimNbNonResp  ClaimNbParking  ClaimNbFireTheft  ClaimNbWindscreen  OutUseNb  RiskArea
0      145813    0.62      11      2004-05-19  NaN         0       0         CSP6        Private                 68       0           50          5377.20      1         5        1.00         0.00            1.00            0.00              1.00               0.00      4.00
1      145814    0.76      6       2004-01-01  2004-10-05  1       1         CSP5        Private+trip to office  47       0           50          2017.84      1         5        0.00         1.00            0.00            0.00              0.00               0.00      6.00
2      145833    0.02      5       2004-10-23  2004-11-01  0       0         CSP5        Private+trip to office  49       0           50          356.77       1         5        0.00         1.00            0.00            0.00              0.00               2.00      8.00
3      145845    0.83      10      2004-01-01  2004-11-01  0       0         CSP6        Private                 75       0           50          645.13       1         5        0.00         0.00            0.00            0.00              0.00               0.00      9.00
4      145846    0.96      5       2004-01-01  2004-12-16  1       0         CSP5        Private+trip to office  49       0           54          1200.42      1         5        0.00         0.00            0.00            0.00              1.00               0.00      8.00
5      145850    0.61      10      2004-05-21  NaN         0       0         CSP6        Private                 68       1           50          4326.53      1         5        0.00         0.00            0.00            0.00              0.00               0.00      10.00
6      145863    0.48      6       2004-01-01  2004-06-25  1       0         CSP1        Professional            48       0           64          2667.09      1         5        0.00         0.00            1.00            0.00              1.00               0.00      9.00
7      145866    1.00      5       2004-01-01  2004-12-31  0       0         CSP4        Professional            41       0           60          386.71       1         5        0.00         0.00            0.00            0.00              0.00               0.00      6.00
8      145883    0.10      8       2004-01-01  2004-02-06  0       1         CSP4        Professional            57       0           50          2020.64      1         5        0.00         1.00            0.00            0.00              0.00               0.00      11.00
9      145899    0.82      11      2004-03-07  NaN         0       0         CSP6        Private                 71       0           50          467.73       1         5        0.00         0.00            1.00            0.00              0.00               0.00      7.00

Last rows

       df_index  Exposure  LicAge  RecordBeg   RecordEnd   Gender  MariStat  SocioCateg  VehUsage                DrivAge  HasKmLimit  BonusMalus  ClaimAmount  ClaimInd  Dataset  ClaimNbResp  ClaimNbNonResp  ClaimNbParking  ClaimNbFireTheft  ClaimNbWindscreen  OutUseNb  RiskArea
15522  310862    0.51      8       2004-02-27  2004-09-01  1       0         CSP5        Private+trip to office  55       0           50          845.46       1         9        0.00         2.00            0.00            0.00              0.00               0.00      10.00
15523  310878    0.83      13      2004-03-01  NaN         0       0         CSP6        Private                 75       0           50          351.32       1         9        0.00         0.00            0.00            0.00              0.00               1.00      7.00
15524  310880    0.92      4       2004-02-01  NaN         1       0         CSP5        Private+trip to office  37       0           64          776.04       1         9        0.00         0.00            0.00            0.00              1.00               2.00      7.00
15525  310884    1.00      8       2004-01-02  NaN         0       0         CSP5        Private+trip to office  55       0           50          2981.70      1         9        0.00         1.00            1.00            0.00              1.00               0.00      10.00
15526  310899    0.23      5       2004-04-02  2004-06-25  0       0         CSP4        Private+trip to office  49       0           50          277.19       1         9        1.00         0.00            0.00            0.00              1.00               2.00      10.00
15527  310910    0.33      9       2004-01-01  2004-05-01  0       0         CSP6        Private                 70       1           50          230.74       1         9        0.00         1.00            0.00            0.00              0.00               0.00      9.00
15528  310963    0.18      7       2004-10-25  NaN         1       0         CSP5        Private                 69       1           62          1562.36      1         9        2.00         0.00            0.00            0.00              0.00               0.00      7.00
15529  310967    0.75      9       2004-01-01  2004-10-01  0       0         CSP6        Private                 63       0           50          476.32       1         9        0.00         1.00            0.00            0.00              1.00               0.00      6.00
15530  310973    0.42      6       2004-02-28  2004-07-30  0       0         CSP5        Private+trip to office  53       0           50          1117.89      1         9        0.00         0.00            0.00            0.00              0.00               0.00      7.00
15531  310976    1.00      7       2004-01-01  NaN         1       0         CSP5        Private+trip to office  54       0           50          2764.17      1         9        0.00         0.00            0.00            0.00              1.00               0.00      7.00